Online Adaptation in Learning Classifier Systems: Stream Data Mining
Authors
Abstract
In data mining, concept drift refers to the phenomenon in which the underlying model (or concept) changes over time. The aim of this paper is twofold. First, we propose a fundamental characterization and quantification of different types of concept drift. The proposed theory enables a rigorous investigation of learning system performance on streamed data. Second, we investigate the impact of different amounts and types of concept drift on evolutionary classification systems, focusing on the learning classifier system approach. We compare the performance of a Pittsburgh-type system, GAssist, which learns in batch mode using windowing techniques, with that of a Michigan-type system, XCS, which is a natural online learner. The results show that both systems handle the various concept drifts well. Behavioral differences are discussed, revealing task dependencies, representation dependencies, and dynamics dependencies. The discussion and conclusions outline a path towards more detailed measures of problem dynamics in the data mining realm.
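The contrast between a windowed batch learner and an incremental online learner under abrupt concept drift can be illustrated with a toy sketch. This is not GAssist or XCS, and it does not reproduce the paper's drift characterization. The names `make_stream`, `WindowedLearner`, `OnlineLearner`, and `prequential_errors`, the window size, the learning rate, and the Boolean toy concept are all assumptions made for illustration.

```python
import random
from collections import deque


def make_stream(n, drift_at, seed=0):
    """Yield (x, y) pairs; the toy concept flips abruptly at index drift_at."""
    rng = random.Random(seed)
    for t in range(n):
        x = rng.randint(0, 1)
        y = x if t < drift_at else 1 - x  # concept A (y = x), then concept B (y = 1 - x)
        yield x, y


class WindowedLearner:
    """Hypothetical batch-style learner: re-estimates from the last `size` examples."""

    def __init__(self, size):
        self.window = deque(maxlen=size)  # old examples fall out automatically

    def predict(self, x):
        labels = [y for xi, y in self.window if xi == x]
        if not labels:
            return 0  # arbitrary default when x has not been seen in the window
        return int(2 * sum(labels) >= len(labels))  # majority vote, ties go to 1

    def learn(self, x, y):
        self.window.append((x, y))


class OnlineLearner:
    """Hypothetical incremental learner: delta-rule estimate of P(y = 1 | x)."""

    def __init__(self, beta=0.2):
        self.beta = beta  # learning rate (assumed value)
        self.p = {0: 0.5, 1: 0.5}

    def predict(self, x):
        return int(self.p[x] >= 0.5)

    def learn(self, x, y):
        self.p[x] += self.beta * (y - self.p[x])


def prequential_errors(learner, stream):
    """Test-then-train: predict each example before learning from it."""
    errors = []
    for x, y in stream:
        errors.append(int(learner.predict(x) != y))
        learner.learn(x, y)
    return errors


if __name__ == "__main__":
    n, drift_at = 1000, 500
    learners = [("windowed", WindowedLearner(50)), ("online", OnlineLearner(0.2))]
    for name, learner in learners:
        errs = prequential_errors(learner, make_stream(n, drift_at))
        print(name, "errors before drift:", sum(errs[:drift_at]),
              "errors after drift:", sum(errs[drift_at:]))
```

In this sketch both learners make errors right after the drift point and then recover. How quickly they recover depends on the assumed window size and learning rate, which is one simple way to see why the amount and type of drift matter when comparing systems.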
Similar resources
Online Passive Aggressive Active Learning and Its Applications
We investigate online active learning techniques for classification tasks in data stream mining applications. Unlike traditional learning approaches (either batch or online learning), which often need to request the class label of every incoming instance, online active learning queries only a subset of informative incoming instances to update the classification model, which aims to maximize cla...
A Probabilistic Bayesian Classifier Approach for Breast Cancer Diagnosis and Prognosis
Medical diagnosis problems are the most effective component of treatment policies. Recently, significant advances have been made in medical diagnosis using data mining techniques. Data mining, or knowledge discovery, is the search of large databases to discover patterns and evaluate the probability of future occurrences. In this paper, Bayesian Classifier is used as a Non-linear dat...
Sentiment Classification over Opinionated Data Streams Through Informed Model Adaptation
Opinionated data streams are very popular data paradigms nowadays as more and more users share their opinions online about almost everything from products to persons, brands and ideas. One of the key challenges for opinionated stream mining is dealing with concept drifts in the underlying stream population by building learners that adapt to such concept changes. Ageing is a typical way of adapt...
Detecting Concept Drift in Data Stream Using Semi-Supervised Classification
A data stream is a sequence of data generated by various information sources at high speed and in high volume. Classifying data streams faces three challenges: unlimited length, online processing, and concept drift. In related research, the challenge of unlimited stream length is commonly met by dividing the stream into fixed-size windows or by gradual forgetting. Concept drift refer...